Modification Operation Buffering: A Low-Overhead Approach to Checkpoint User Files
Abstract
Checkpointing and recovery is a technique that saves process state during normal execution and restores the saved state after a failure, reducing the amount of lost work. One of its important capabilities is saving and restoring the state of the process's user files. This paper presents an approach called Modification Operation Buffering (MOB) to support this capability. The MOB approach buffers all modification operations issued after a checkpoint until the next one, making all the operations between two checkpoints atomic as a whole. By dynamically choosing a suitable size for the memory buffer and by hiding the latency of flushing the buffer, the MOB approach achieves lower overhead than other approaches.
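The buffering idea in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the class `BufferedFile` and its methods are hypothetical names, the buffer is an unbounded in-memory list (the dynamic buffer sizing and latency hiding mentioned in the abstract are omitted), and the write-to-temporary-file-then-rename step is an assumption chosen here to make the flush appear atomic.

```python
import os
import tempfile


class BufferedFile:
    """Hypothetical illustration: defer file writes until the next checkpoint."""

    def __init__(self, path):
        self.path = path
        self.pending = []  # (offset, data) writes issued since the last checkpoint

    def write(self, offset, data):
        # Record the modification instead of touching the file on disk.
        self.pending.append((offset, bytes(data)))

    def read_all(self):
        # The process's view: on-disk content with pending writes overlaid in order.
        with open(self.path, "rb") as f:
            content = bytearray(f.read())
        for offset, data in self.pending:
            end = offset + len(data)
            if end > len(content):
                content.extend(b"\0" * (end - len(content)))
            content[offset:end] = data
        return bytes(content)

    def checkpoint(self):
        # Apply all buffered operations as one unit.
        # Assumption of this sketch: write a temporary file, then rename it over
        # the original so a crash leaves either the old or the new version.
        content = self.read_all()
        tmp_path = self.path + ".tmp"
        with open(tmp_path, "wb") as f:
            f.write(content)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, self.path)
        self.pending.clear()

    def rollback(self):
        # Recovery: the file still holds the last checkpointed state,
        # so discarding the buffer is enough.
        self.pending.clear()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as workdir:
        path = os.path.join(workdir, "data.bin")
        with open(path, "wb") as f:
            f.write(b"hello world")

        bf = BufferedFile(path)
        bf.write(0, b"HELLO")
        print(bf.read_all())   # buffered view includes the pending write
        bf.rollback()
        print(bf.read_all())   # back to the checkpointed content
```

Because nothing reaches the file between checkpoints, rolling back only requires dropping the buffer, which is where the low recovery cost of a buffering scheme would come from in a sketch like this one.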
Similar resources
Design and Implementation of a Low-Overhead File Checkpointing Approach
An important capability of the checkpointing and recovery technique is file checkpointing, i.e., saving and restoring the state of the process's user files. This paper describes the design and implementation of a file checkpointing approach called Modification Operation Buffering. This approach buffers all the modification operations after a checkpoint until the next one, making all the operat...
The Consistent File-Status in a User-Triggered Checkpointing Approach
The user-triggered checkpointing tool implements a non-blocking, coordinated (global) checkpointing method in which the programmer defines the contents and the position of the recovery line. Within this tool, we developed and implemented file checkpointing. This allows the status of files to be included in the checkpoints and restored when the application is rolled back. The file-sta...
Fast Communication Mechanisms - Coupling Hardware Distributed Shared Memory and User-Level Messaging
Low latencies for small messages are an important factor in efficient fine-grained parallel computation. The Active Messages concept provides this minimal overhead by eliminating certain parts of the critical path of sending and receiving messages, namely the context switch into the operating system kernel (avoided by using user-mode I/O) and multiple buffering in the network layer. Hardware-supporte...
A Low Overhead Recovery Technique Using Quasi-Synchronous Checkpointing
In this paper, we propose a quasi-synchronous checkpointing algorithm and a low-overhead recovery algorithm based on it. The checkpointing algorithm preserves process autonomy by allowing processes to take checkpoints asynchronously, and it uses communication-induced checkpoint coordination to advance the recovery line, which helps bound rollback propagation during a recovery. Thus, it has th...
Efficient User-Level Thread Migration and Checkpointing on Windows NT Clusters
ion of running on a single shared memory multiprocessor, Brazos supports message passing by implementing the MPI library [20]. Thread migration in the context of a distributed system involves the movement of a computation thread from one currently executing process to another running process. Thread migration has been previously proposed as a tool for load-balancing and communication reduction ...